Abstract

We give a lower bound of $\tilde{\Omega}(\sqrt{n})$ for the degree-4 Sum-of-Squares SDP relaxation for the
planted clique problem. Specifically, we show that on an Erdős-Rényi graph $G(n, \frac{1}{2})$, with high
probability there is a feasible point for the degree-4 SOS relaxation of the clique problem with
an objective value of $\tilde{\Omega}(\sqrt{n})$, so that the program cannot distinguish between a random graph
and a random graph with a planted clique of size $\tilde{O}(\sqrt{n})$. This bound is tight.

We build on the works of Deshpande and Montanari and Meka et al., who give lower bounds
of $\tilde{\Omega}(n^{1/3})$ and $\tilde{\Omega}(n^{1/4})$ respectively. We improve on their results by making a perturbation to
the SDP solution proposed in their work, then showing that this perturbation remains PSD as
the objective value approaches $\tilde{\Omega}(n^{1/2})$.

In an independent work, Hopkins, Kothari and Potechin [HKP15] have obtained a similar 
lower bound for the degree-4 SOS relaxation. 


*UC Berkeley, prasad@cs.berkeley.edu. Supported by NSF Career Award, NSF CCF-1407779 and the Alfred
P. Sloan Fellowship.

†UC Berkeley, tschramm@cs.berkeley.edu. Supported by an NSF Graduate Research Fellowship (NSF award no
1106400).



1 Introduction 


In the Maximum Clique problem, the input consists of a graph $G = (V, E)$ and the goal is to find
the largest subset $S$ of vertices all of which are connected to each other. The Maximum Clique
problem is NP-hard to approximate within an $n^{1-\varepsilon}$-factor for all $\varepsilon > 0$ [Has96, Kho01].

Karp [Kar76] suggested an average case version of the Maximum Clique problem on random
graphs drawn from the Erdős-Rényi distribution $G(n, \frac{1}{2})$. A heuristic argument shows that an
Erdős-Rényi graph $G \sim G(n, \frac{1}{2})$ has a clique of size $(1 - o(1))\log n$ with high probability: given
such a graph, choose a random vertex, then choose one of its neighbors, then choose a vertex
adjacent to both, and continue this process until there is no vertex adjacent to the clique. After
$\log n$ steps, the probability that a given remaining vertex can be added is about $\frac{1}{n}$, and so after
about $\log n$ steps this process terminates. This heuristic argument can be made precise, and one can
show that this greedy algorithm can find a clique of size $(1 + o(1))\log n$ in an instance of $G(n, \frac{1}{2})$
in polynomial time.
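The greedy heuristic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `sample_gnp_half`, `greedy_clique` and `is_clique` are hypothetical, and the graph is stored as a list of adjacency sets.

```python
import random

def sample_gnp_half(n, rng):
    """Sample G(n, 1/2) as a list of adjacency sets (illustrative helper)."""
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def greedy_clique(adj, rng):
    """Grow a clique by repeatedly adding a random vertex adjacent to all chosen ones."""
    candidates = set(range(len(adj)))  # vertices adjacent to every vertex chosen so far
    clique = []
    while candidates:
        v = rng.choice(sorted(candidates))
        clique.append(v)
        candidates &= adj[v]           # v is not its own neighbor, so it drops out here
    return clique

def is_clique(adj, vertices):
    """Check that every pair of the given vertices is joined by an edge."""
    return all(v in adj[u] for i, u in enumerate(vertices) for v in vertices[i + 1:])
```

Each step roughly halves the candidate set, which is why the process stops after about $\log_2 n$ steps.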

Indeed, with some work it can be shown that the largest clique in an instance of $G(n, \frac{1}{2})$
actually has size $(2 \pm o(1))\log n$ with high probability [GM75, Mat76, BE76]. But while some
clique of size $(1 \pm o(1))\log n$ can easily be found in polynomial time (using the heuristic from the
previous paragraph), an efficient algorithm for finding the clique of size $2\log n$ has been much more
elusive. In his seminal paper on the probabilistic analysis of combinatorial algorithms, Karp asked
whether there exists a polynomial-time algorithm for finding a clique of size $(1 + \varepsilon)\log n$ for any
fixed constant $\varepsilon > 0$ [Kar76]. Despite extensive efforts, there has been no algorithmic progress on
this question since.

The planted clique problem is a natural variant of this problem wherein the input is promised to
be either a graph $G \sim G(n, \frac{1}{2})$ or a graph $G \sim G(n, \frac{1}{2})$ with a clique of size $k$ planted
within its vertices. The goal of the algorithm is to distinguish between the two distributions.

For $k > (2 + \varepsilon)\log n$, there is a simple quasi-polynomial time algorithm that distinguishes the
two distributions. The algorithm simply tries all subsets of $(2 + \varepsilon)\log n$ vertices, looking for a
clique. For a random graph $G(n, \frac{1}{2})$, there are no cliques of size $(2 + \varepsilon)\log n$, but there is one in the
planted distribution. Clearly, the planted clique problem becomes easier as the planted clique's size
$k$ increases. Yet there are no polynomial-time algorithms known for this problem for any $k < o(\sqrt{n})$.
For $k = \Omega(\sqrt{n})$, a result of Alon et al. uses random matrix theory to argue that looking at the
spectrum of the adjacency matrix suffices to solve the decision problem [AKS98].
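As a rough illustration of the quasi-polynomial search mentioned above, the following sketch tries every vertex subset of size $\lceil (2+\varepsilon)\log_2 n \rceil$. The names `has_clique_of_size` and `looks_planted`, the default `eps`, and the choice of base-2 logarithm are assumptions made for this example.

```python
import math
from itertools import combinations

def has_clique_of_size(adj, k):
    """Brute force: does any k-subset of vertices induce a clique?"""
    n = len(adj)
    return any(
        all(v in adj[u] for u, v in combinations(S, 2))
        for S in combinations(range(n), k)
    )

def looks_planted(adj, eps=0.5):
    """Report 'planted' if a clique of size ceil((2 + eps) * log2(n)) exists."""
    k = math.ceil((2 + eps) * math.log2(len(adj)))
    return has_clique_of_size(adj, k)
```

The search inspects roughly $n^{O(\log n)}$ subsets, which is where the quasi-polynomial running time comes from.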

The works of [FK08, BV09] show that, if one were able to efficiently calculate the injective tensor
norm of a certain random order-$m$ tensor, then by extending the spectral algorithm of [AKS98] one
would have a polynomial-time algorithm for $k > n^{1/m}$. However, there is no known algorithm that
efficiently computes the injective tensor norm of an order-$m$ tensor; in fact, the injective
tensor norm is hard to approximate in the general case [HM13].

While algorithmic progress has been slow, there has been success in proving strong lower bounds
for the planted clique problem within specific algorithmic frameworks. The first such bound was
given by Jerrum, who showed that a class of Markov Chain Monte Carlo algorithms require a
super-polynomial number of steps to find a clique of size $(1 + \varepsilon)\log n$, for any fixed $\varepsilon > 0$, in an instance of
$G(n, \frac{1}{2})$ [Jer92]. Feige and Krauthgamer showed that $r$ levels of the Lovász-Schrijver SDP hierarchy
are needed to find a hidden clique of size $k \ge \tilde{\Omega}(\sqrt{n/2^r})$ [FK00, FK03]. Feldman et al. show
(for the planted bipartite clique problem) that any "statistical algorithm" cannot distinguish in a
polynomial number of queries between the random and planted cases for $k < \tilde{O}(\sqrt{n})$ [FGR+12].

More recently, there has been an effort to replicate the results of [FK00, FK03] for the
Sum-of-Squares (or SOS) hierarchy, a more powerful SDP hierarchy. The recent work of [MPW15]
achieves a $\tilde{\Omega}(n^{1/2r})$ lower bound for $r$ rounds of the SOS hierarchy, by demonstrating a feasible
solution for the level-$r$ SDP relaxation with a large enough objective value in the random case.
The work of [DM15a] achieves a sharper $\tilde{\Omega}(n^{1/3})$ lower bound for the Meka-Potechin-Wigderson
SDP solution, but only for $r = 2$ rounds; a counterexample of Kelner (which may be found in
[Bar14]) demonstrates that the analysis of [DM15a] is tight for the integrality gap instance of
[DM15a, MPW15] within logarithmic factors.

This line of work brings to the fore the question: can a $d = O(1)$-degree SOS relaxation solve
the planted clique problem for some $k < \sqrt{n}$? While lower bounds are known for Lovász-Schrijver
SDP relaxations for planted clique [FK00, FK03], SOS relaxations can in general be much more
powerful than Lovász-Schrijver relaxations. For example, while there are instances of unique games
that are hard for $\mathrm{poly}(\log\log n)$ rounds of the Lovász-Schrijver SDP hierarchy [KS09, RS09], recent
work has shown that these instances are solved by the degree-8 SOS hierarchy [BBH+12].

Moreover, even the degree-4 SOS relaxation proves to be surprisingly powerful in a few
applications:

• First, the work of Barak et al. [BBH+12] shows that a degree-4 SOS relaxation can certify
2-to-4 hypercontractivity of low degree polynomials over the hypercube. This argument
is the reason that hard instances for Lovász-Schrijver and other SDP hierarchies constructed
via the noisy hypercube gadgets are easily refuted by the SOS hierarchy.

• Second, a degree-4 SOS relaxation can certify that the 2-to-4 norm of a random subspace of
dimension at most $o(\sqrt{n})$ is bounded by a constant (with high probability over the choice of
the subspace) [BBH+12]. This average-case problem has superficial similarities to the planted
clique problem.

In this work, we make modest progress towards a lower bound for SOS relaxations of planted 
clique by obtaining a nearly tight lower bound for the degree-4 SOS relaxation (corresponding to 
two rounds, r = 2). More precisely, our main result is the following. 

Theorem 1.1. Suppose that $G \sim G(n, \frac{1}{2})$. Then with probability $1 - O(n^{-\ldots})$, there exists a feasible
solution to the SOS-SDP of degree $d = 4$ ($r = 2$) with objective value $\frac{\sqrt{n}}{\mathrm{polylog}\, n}$.¹

Note that by the work of [AKS98], this result is tight up to logarithmic factors. In an independent
work, Hopkins, Kothari and Potechin [HKP15] have obtained a similar result.

Our work builds heavily on previous work by Meka, Potechin and Wigderson [MPW15] and
Deshpande and Montanari [DM15a]. Since the SDP solution constructed in these works is infeasible
for $k > n^{1/3}$, we introduce a modified SDP solution with objective value $\tilde{\Omega}(\sqrt{n})$, and prove that for
a random graph $G$ the solution is feasible with high probability. At the parameter setting for which
the objective value becomes $\Omega(n^{1/3})$, the SDP solutions of [DM15a, MPW15] violate the PSDness
constraint, or equivalently, there exists a set of test vectors $X$ such that $x^\top M x < 0$ for all $x \in X$.
Our feasible SDP solution is a perturbation of their solution: we add spectral mass to the solution
along the vectors from the set $X$, then enforce the linear constraints of the SDP program.

1.1 Notation 

We use the symbol $\succeq$ to denote the PSD ordering on matrices, saying that $A \succeq 0$ if $A$ is PSD and
that $A \succeq B$ if $A - B \succeq 0$. When we wish to hide constant factors for clarity, we use $a \lesssim b$ to denote
that $a \le C \cdot b$ for some constant $C$.

¹We have made no effort to optimize logarithmic factors in this work; a more delicate analysis of the required
logarithmic factors is certainly possible.
We denote by $1_n \in \mathbb{R}^n$ the vector such that $1_n(i) = 1$ for all $i \in [n]$, or the all-1's vector. We denote
the normalized version of this vector by $\hat{1}_n = 1_n/\|1_n\|$. Further, we use $J_n \stackrel{\mathrm{def}}{=} 1_n 1_n^\top$ and $Q_n \stackrel{\mathrm{def}}{=} \hat{1}_n \hat{1}_n^\top$.

We will drop the subscript when $n$ is clear from context.

In our notation, we at times closely follow the notation of [DM15a], as our paper builds on their 
results and we recycle many of their bounds. 

For convenience, we will use the shorthand $\tilde{n} = n\log n$. We will abuse notation by using $\binom{n}{2}$
to refer to both the binomial coefficient and to the set $\binom{n}{2} = \{\{a,b\} \mid a, b \in [n],\ a \ne b\}$. We will
also use the notation $\binom{n}{\le 2}$ to refer to the union of sets $\bigcup_{i=0}^{2} \binom{n}{i}$. Further, when we give a vector
$v \in \mathbb{R}^{\binom{n}{2}}$, we will identify the entries of $v$ by unordered pairs of elements of $[n]$.

Throughout the paper, we will (unless otherwise stated) work with some fixed instance $G$ of
$G(n, \frac{1}{2})$, and denote by $A_i \in \mathbb{R}^n$ the "centered" $i$th row of the adjacency matrix of $G$, with $j$th
entry equal to $1$ if the edge $(i,j) \in E$, equal to $-1$ if the edge $(i,j) \notin E$, and equal to $0$ for $j = i$.
We will use $A_{ij}$ to denote the $j$th index of $A_i$.

1.2 Organization 

In Section 2, we give background material on the degree-4 SOS relaxation for the max-clique
problem, describe the integrality gap of Deshpande and Montanari for the planted clique problem, and
explain the obstacle they face to reach an integrality gap value of $\tilde{\Omega}(\sqrt{n})$. We then describe our
integrality gap instance, motivating our construction using the obstacle for the Deshpande-Montanari
and Meka-Potechin-Wigderson witness, and give an overview of our proof that our integrality gap
instance is feasible. In Section 3, we prove that our witness is PSD, completing the proof of
feasibility. Section 4 contains our concentration bounds for random matrices that arise within our proofs.
In our proof, we reuse several bounds proved by Deshpande and Montanari. As far as possible, we
restate the claims from [DM15a] as they are used; for convenience, in Appendix A, we list a few
other claims from Deshpande and Montanari that we use in this paper.

2 Preliminaries and Proof Overview 

In this section, we describe the degree-4 SOS relaxation for the max-clique SDP and give background 
on the Deshpande-Montanari witness. We then describe our own modified witness, and give an 
overview of the proof that our witness is feasible (the difficult part being showing that our witness 
is PSD). The full proof of feasibility is deferred to Section 3. 

2.1 Degree-4 SOS Relaxation for Max Clique

The degree $d = 4$ SOS relaxation for the maximum clique problem is a semidefinite program whose
variables are $X \in \mathbb{R}^{\binom{n}{\le 2} \times \binom{n}{\le 2}}$. For a subset $S \subseteq V$ with $|S| \le 2$, the variable $X_S$ indicates whether
$S$ is contained in the maximum clique. For a graph $G$ on $n$ vertices, the program can be described
as follows.


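$$
\begin{aligned}
\text{Maximize}\quad & \sum_{i \in [n]} X_{\{i\},\{i\}} && (2.1)\\
\text{subject to}\quad & X_{S_1,S_2} \in [0,1] \\
& X_{S_1,S_2} = X_{S_3,S_4} && \text{whenever } S_1 \cup S_2 = S_3 \cup S_4 \\
& X_{S_1,S_2} = 0 && \text{if } S_1 \cup S_2 \text{ is not a clique in } G \\
& X_{\emptyset,\emptyset} = 1 \\
& X \succeq 0
\end{aligned}
$$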


It is instructive to think of the variable Xs as a pseudoexpectation of the product of indicator 
variables, or a pseudomoment: 


$$X_S = \tilde{\mathbb{E}}\Big[\prod_{i \in S} \mathbb{I}(i \in \text{clique})\Big].$$

Intuitively, the constraints of the SDP force the solution to behave somewhat like the moments of a
probability distribution over integral solutions, although they needn't correspond to the moments
of a true distribution, hence the term pseudomoment. For more background, see e.g. [Bar14].
The pseudomoment interpretation of the SDP solution motivates the choice of the witness in the
prior work. For example, we may notice that the objective function in this view is simply the
pseudoexpectation of the size of the planted clique, $\tilde{\mathbb{E}}[\sum_{i \in [n]} \mathbb{I}(i \in \text{clique})]$.

If sdpval(G) denotes the optimum value of the degree-4 SDP relaxation on graph $G$, then clearly
sdpval(G) is at least the size of the maximum clique in $G$. In order to prove a lower bound for
the degree-4 SOS relaxation on $G(n, \frac{1}{2})$, it is sufficient to argue that with overwhelming probability,
sdpval(G) is significantly larger than the maximum clique on a random graph. This amounts to
exhibiting a feasible SDP solution with large objective value, for an overwhelming fraction of graphs
sampled from $G(n, \frac{1}{2})$. Formally, we will show the following:


Theorem 2.1 (Formal version of Theorem 1.1). There exists an absolute constant $c \in \mathbb{N}$ such that

$$\Pr_{G \sim G(n, \frac{1}{2})}\left[\mathrm{sdpval}(G) \ge \frac{\sqrt{n}}{\log^c n}\right] \ge 1 - O(n^{-\ldots}).$$


We obtain Theorem 2.1 by constructing a point, or witness, for each $G \sim G(n, \frac{1}{2})$, then
proving that the point is feasible with high probability. We defer the description of our witness to
Definition 2.8 and Definition 2.9, as we spend Section 2.2 and Section 2.3 motivating our
construction; however, the curious reader may skip ahead to Definition 2.9, which does not require the
knowledge of additional notation.


2.2 Deshpande-Montanari Witness 


Henceforth, fix a graph $G$ that is sampled from $G(n, \frac{1}{2})$. Both the work of Meka, Potechin and
Wigderson [MPW15] and that of Deshpande and Montanari [DM15a] construct essentially the same
SDP solution for the degree-4 SOS relaxation.

This SDP solution assigns to each clique of size $1, \ldots, d$ a value that depends only on its
size (in our case, $d = 4$). In essence, their solution takes advantage of the independence of the
$G(n,p)$ instance. The motivating observation is that the variable $X_S$ can be thought of as a
pseudoexpectation of the indicator that $S$ is a subclique of the planted clique. The idea is then
to make this pseudoexpectation of the indicator consistent with the true expectation under the
distribution where a clique of size $k$ is planted uniformly at random within the instance of $G(n,p)$.
Thus, every vertex is in the clique "with uniform probability:"

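$$\tilde{\mathbb{E}}[X_{\{i\}}] \approx \mathbb{E}[\mathbb{I}(i \text{ is in planted clique})] = \frac{k}{n}.$$

Then, the same principle is applied to edges, triangles, and 4-cliques, so that

$$\tilde{\mathbb{E}}[X_S] \approx \mathbb{I}(S \text{ is clique}) \cdot \mathbb{E}[\mathbb{I}(S \text{ is in planted clique})].$$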
This is the general idea of the SDP solution of [DM15a]. More formally, the SDP solution in
[DM15a] is specified by four parameters $\alpha = (\alpha_1, \alpha_2, \alpha_3, \alpha_4)$, as

$$M(G,\alpha)_{A,B} = \alpha_{|A \cup B|} \cdot \mathcal{G}_{A \cup B},$$

where for a set of vertices $A \subseteq V$, $\mathcal{G}_A$ is the indicator that the subgraph induced on $A$ is a clique.
The parameters $\{\alpha_i\}_{i \in [4]}$ determine the value of the objective function, and the feasibility of the
solution. As a convention, we will define $\alpha_0 = 1$.

It is easy to check that the solution $M(G,\alpha)$ satisfies all the linear constraints of the SOS
program (2.1), since it assigns non-zero values only to cliques in $G$. The key difficulty is in showing
that the matrix $M$ is PSD for an appropriate choice of parameters $\alpha$.
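To make the "easy to check" claim concrete, here is a small sketch that builds the witness $M(G,\alpha)$ on a toy graph and tests the linear constraints of (2.1) directly (the PSD constraint is not checked). The helper names, the dictionary representation, and the sample parameters are assumptions made for illustration.

```python
from itertools import combinations

def subsets_up_to_2(n):
    """All subsets of {0, ..., n-1} of size at most 2, as frozensets."""
    return [frozenset(s) for k in range(3) for s in combinations(range(n), k)]

def is_clique(edges, S):
    """True if every pair inside S is an edge (trivially true when |S| <= 1)."""
    return all(frozenset(p) in edges for p in combinations(sorted(S), 2))

def dm_witness(n, edges, alpha):
    """M[A, B] = alpha[|A u B|] * 1[A u B is a clique]; alpha[0] is taken to be 1."""
    idx = subsets_up_to_2(n)
    return {(A, B): alpha[len(A | B)] * is_clique(edges, A | B) for A in idx for B in idx}

def satisfies_linear_constraints(edges, M):
    """Check the linear constraints of (2.1); PSDness is deliberately not tested."""
    value_of_union = {}
    for (A, B), val in M.items():
        U = A | B
        if not 0 <= val <= 1:
            return False                      # entries must lie in [0, 1]
        if val != 0 and not is_clique(edges, U):
            return False                      # non-cliques must get value 0
        if value_of_union.setdefault(U, val) != val:
            return False                      # entry may depend only on A u B
    return M[(frozenset(), frozenset())] == 1
```

The check passes for any $\alpha$ with entries in $[0,1]$ and $\alpha_0 = 1$, which is why all of the difficulty lies in the PSD constraint.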

In order to show that $M(G,\alpha) \succeq 0$, it is sufficient to show that $N(G,\alpha) \succeq 0$ where

$$N_{A,B} = \alpha_{|A \cup B|} \cdot \prod_{i \in A \setminus B,\ j \in B \setminus A} \mathcal{G}_{ij},$$

where $\mathcal{G}_{ij}$ is the indicator for the presence of the edge $(i,j)$. In words, $N$ is the matrix where
the entry $\{a,b\},\{c,d\}$ is proportional not to the indicator of whether $\{a,b,c,d\}$ is a clique, but to
the indicator of whether $G$ has as a subgraph the bipartite clique with bipartitions $\{a,b\}$ and
$\{c,d\}$. It is easy to see that the matrix $M$ is obtained by dropping from $N$ the rows and columns
corresponding to $\{a,b\} \in \binom{n}{2}$ where $(a,b) \notin E(G)$. Hence $N \succeq 0 \implies M \succeq 0$.

Notice that $N$ is a random matrix whose entries depend on the edges in the random graph $G$.
At the risk of over-simplification, the approach of both the previous works [MPW15] and [DM15a]
can be broadly summarized as follows:

1. (Expectation) Show that the expected matrix $\mathbb{E}[N]$ has sufficiently large positive eigenvalues.

2. (Concentration) Show that with high probability over the choice of $G$, the noise matrix
$N - \mathbb{E}[N]$ has bounded eigenvalues, so as to ensure that $N = \mathbb{E}[N] + (N - \mathbb{E}[N]) \succeq 0$.

Here we will sketch a few key details of the argument in [DM15a]. The matrix $N \in \mathbb{R}^{\binom{n}{\le 2} \times \binom{n}{\le 2}}$
can be decomposed into blocks $\{N_{ab}\}_{a,b \in \{0,1,2\}}$ where $N_{ab} \in \mathbb{R}^{\binom{n}{a} \times \binom{n}{b}}$. Deshpande and Montanari
use the Schur complements to reduce the problem of proving that $N \succeq 0$ to facts about the blocks
$\{H_{ab}\}_{a,b \in \{0,1,2\}}$. Specifically, they show the following lemma:

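Lemma 2.2. Let $A \in \mathbb{R}^{\binom{n}{\le 2} \times \binom{n}{\le 2}}$ be the matrix defined so that $A_{A,B} = \alpha_{|A|}\alpha_{|B|}$. For $a, b \in \{0,1,2\}$,
let $H_{a,b}$ be the submatrix of $N(G,\alpha) - A$ corresponding to monomials $X_S$ with $|S| = a + b$. Then
$N(G,\alpha)$ is PSD if and only if

$$H_{11} \succeq 0, \qquad (2.2)$$

$$H_{22} - H_{12}^\top H_{11}^{-1} H_{12} \succeq 0. \qquad (2.3)$$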

The most significant challenge is to argue that (2.3) holds with high probability. In fact, the
inequality only holds for the Deshpande-Montanari SDP solution with high probability for parameters
$\alpha$ for which the objective value is $o(n^{1/3})$.

Expected matrix. The expected matrix $\mathbb{E}[H_{22}]$ is symmetric with respect to permutations of the
vertices. It forms an association scheme (see [MPW15, DM15a]), by virtue of which its eigenvalues
and eigenspaces are well understood. In particular, the following proposition in [DM15a] is an
immediate consequence of the theory of association schemes.
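Proposition 2.3 (Proposition 4.16 in [DM15a]). $\mathbb{E}[H_{22}]$ has three eigenspaces, $V_0, V_1, V_2$ such that

$$\mathbb{E}[H_{22}] = \lambda_0 \Pi_0 + \lambda_1 \Pi_1 + \lambda_2 \Pi_2,$$

where $\Pi_0, \Pi_1, \Pi_2$ are the projections to the spaces $V_0, V_1, V_2$ respectively. The eigenvalues are given
by,

$$\lambda_0(\alpha) \stackrel{\mathrm{def}}{=} \alpha_2 + (n-2)\alpha_3 + \frac{(n-2)(n-3)}{32}\alpha_4 - \frac{n(n-1)}{2}\alpha_2^2,$$

$$\lambda_1(\alpha) \stackrel{\mathrm{def}}{=} \alpha_2 + \frac{(n-4)}{2}\alpha_3 - \frac{(n-3)}{16}\alpha_4,$$

$$\lambda_2(\alpha) \stackrel{\mathrm{def}}{=} \alpha_2 - \alpha_3 + \frac{\alpha_4}{16}.$$

Further the eigenspaces are given by,

$$V_0 = \mathrm{span}\{1\}, \qquad (2.4)$$

$$V_1 = \mathrm{span}\{u \mid \langle u, 1 \rangle = 0,\ u_{ij} = x_i + x_j \text{ for } x \in \mathbb{R}^n\}, \qquad (2.5)$$

and

$$V_2 = \mathbb{R}^{\binom{n}{\le 2}} \setminus (V_0 \cup V_1), \qquad (2.6)$$

where we have used $\mathbb{R}^{\binom{n}{\le 2}}$ to denote the space of vectors of real numbers indexed by subsets of $[n]$ of
size at most 2.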
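The eigenvalue formulas can be sanity-checked on a small instance with exact rational arithmetic. The sketch below assumes, following the definitions of $N$ and Lemma 2.2, that the $(A,B)$ entry of $\mathbb{E}[H_{22}]$ for pairs $A, B$ is $\alpha_{|A \cup B|} \cdot 2^{-|A \setminus B| \cdot |B \setminus A|} - \alpha_2^2$; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def expected_h22(n, a2, a3, a4):
    """E[H_22] under G(n, 1/2), indexed by unordered pairs (assumed entry model)."""
    pairs = list(combinations(range(n), 2))
    def entry(A, B):
        common = len(set(A) & set(B))
        # E[N_{A,B}]: alpha_{|A u B|} times (1/2)^(number of cross pairs)
        mean_n = {2: a2, 1: a3 / 2, 0: a4 / 16}[common]
        return mean_n - a2 * a2
    return pairs, [[entry(A, B) for B in pairs] for A in pairs]

def eigenvalues(n, a2, a3, a4):
    """lambda_0, lambda_1, lambda_2 as stated in Proposition 2.3."""
    lam0 = (a2 + (n - 2) * a3 + Fraction((n - 2) * (n - 3), 32) * a4
            - Fraction(n * (n - 1), 2) * a2 ** 2)
    lam1 = a2 + Fraction(n - 4, 2) * a3 - Fraction(n - 3, 16) * a4
    lam2 = a2 - a3 + a4 / 16
    return lam0, lam1, lam2

def is_eigenvector(M, v, lam):
    """Exact check that M v = lam * v."""
    return all(sum(r * x for r, x in zip(row, v)) == lam * vi for row, vi in zip(M, v))
```

A test vector for $V_0$ is the all-ones vector, for $V_1$ a vector $u_{ij} = x_i + x_j$ with $\sum_i x_i = 0$, and for $V_2$ a signed 4-cycle such as $e_{01} + e_{23} - e_{02} - e_{13}$.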


Deviation from Expectation. Given the lower bound on eigenvalues of the expected matrix
$\mathbb{E}[H_{22}]$, the next step would be to bound the spectral norm of the noise $H_{22} - \mathbb{E}[H_{22}]$. However,
since the eigenspaces of $\mathbb{E}[H_{22}]$ are stratified (for the given $\alpha$), with one large eigenvalue and several
much smaller eigenvalues, standard matrix concentration does not suffice to give tight bounds. To
overcome this, Deshpande and Montanari split $H_{22}$ and $H_{12}^\top H_{11}^{-1} H_{12}$ along the eigenspaces of $\mathbb{E}[H_{22}]$.
More precisely, let us split $H_{22} - \mathbb{E}[H_{22}]$ as

$$H_{22} - \mathbb{E}[H_{22}] = Q + K,$$

where $Q$ includes all multilinear entries, and $K$ includes all non-multilinear entries, i.e., entries
$K(A,B)$ where $A \cap B \ne \emptyset$. Formally,

$$K(A,B) = \begin{cases} (H_{22} - \mathbb{E}[H_{22}])(A,B) & A \cap B \ne \emptyset, \\ 0 & \text{otherwise}. \end{cases}$$

The spectral norm of the matrix $Q$ over the eigenspaces $V_0, V_1, V_2$ is carefully bounded in [DM15a].

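Lemma 2.4 (Propositions 4.20, 4.25 in [DM15a]). With probability at least $1 - O(n^{-\ldots})$, all of the
following bounds hold:

$$\|\Pi_a Q \Pi_b\| \lesssim \alpha_4 n^{3/2} \quad \forall (a,b) \in \{0,1,2\}^2, \qquad (2.7)$$

$$\|\Pi_2 Q \Pi_2\| \lesssim \alpha_4 n, \qquad (2.8)$$

$$\|K\| \lesssim \alpha_3 n^{1/2}. \qquad (2.9)$$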
Proposition 2.3 and Lemma 2.4 are sufficient to conclude that $H_{22} \succeq 0$ for parameter choices of
$\alpha$ that correspond to planted clique of size up to $n^{1/3}$. More precisely, to argue that with high
probability $H_{22} \succeq 0$, it is sufficient to argue that $\mathbb{E}[H_{22}] \succeq \mathbb{E}[H_{22}] - H_{22}$, i.e.,²

$$\begin{bmatrix} \lambda_0 & 0 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{bmatrix} \succeq \begin{bmatrix} \|\Pi_0 Q \Pi_0\| & \|\Pi_0 Q \Pi_1\| & \|\Pi_0 Q \Pi_2\| \\ \|\Pi_1 Q \Pi_0\| & \|\Pi_1 Q \Pi_1\| & \|\Pi_1 Q \Pi_2\| \\ \|\Pi_2 Q \Pi_0\| & \|\Pi_2 Q \Pi_1\| & \|\Pi_2 Q \Pi_2\| \end{bmatrix}.$$

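Deshpande and Montanari fix $\alpha_1 = \kappa$, $\alpha_2 = 4\kappa^2$, $\alpha_3 = 8\kappa^3$ and $\alpha_4 = 512\kappa^4$ for a parameter $\kappa$.
Using Proposition 2.3 and Lemma 2.4, the above matrix inequality becomes (with constant factors suppressed),

$$\begin{bmatrix} \lambda_0 & 0 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{bmatrix} \succeq \alpha_4 \cdot \begin{bmatrix} n^{3/2} & n^{3/2} & n^{3/2} \\ n^{3/2} & n^{3/2} & n^{3/2} \\ n^{3/2} & n^{3/2} & n \end{bmatrix}, \qquad (2.10)$$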


which can be shown to hold for $\kappa \ll n^{-2/3}$. Eventually, it is necessary to show (2.3), which is
stronger than $H_{22} \succeq 0$. This is again achieved by showing bounds on the spectra of $H_{11}$ and $H_{12}$.
We refer the reader to [DM15a] for more details of the arguments.


2.3 Problematic Subspace 

The SDP solution described above ceases to be PSD at $\kappa \approx n^{-2/3}$, which corresponds to an objective
value of $O(n^{1/3})$. The specific obstruction to $H_{22} \succeq 0$ arises out of (2.10). More precisely, the
bottom $2 \times 2$ principal minor yields the constraint

$$\begin{bmatrix} \lambda_1 & -\|\Pi_1 Q \Pi_2\| \\ -\|\Pi_2 Q \Pi_1\| & \lambda_2 \end{bmatrix} \succeq 0,$$

forcing $\kappa \ll n^{-2/3}$. It is clear that the problematic vectors $x \in \mathbb{R}^{\binom{n}{2}}$ for which $x^\top H_{22} x < 0$
are precisely those for which $x^\top \Pi_2 Q \Pi_1 x < 0$ and $|x^\top \Pi_2 Q \Pi_1 x|$ is large, i.e., $\Pi_2 x$ aligns with the
subspace $Q(V_1 \oplus V_0)$.
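The threshold $\kappa \approx n^{-2/3}$ can be seen numerically from the $2 \times 2$ minor. The sketch below plugs the Deshpande-Montanari parameters (as read from the text above) into $\lambda_1, \lambda_2$ from Proposition 2.3 and uses $\alpha_4 n^{3/2}$ for the off-diagonal norm; treating the hidden constant in (2.7) as 1 is an assumption, and `dm_minor_is_psd` is a hypothetical name.

```python
def dm_minor_is_psd(n, kappa):
    """Test lambda_1 * lambda_2 >= (alpha_4 * n^{3/2})^2 for the DM parameter choice."""
    a2, a3, a4 = 4 * kappa**2, 8 * kappa**3, 512 * kappa**4
    lam1 = a2 + (n - 4) / 2 * a3 - (n - 3) / 16 * a4
    lam2 = a2 - a3 + a4 / 16
    off_diag = a4 * n**1.5  # assumed: bound (2.7) with implied constant 1
    # A symmetric 2x2 matrix is PSD iff its diagonal and determinant are nonnegative.
    return lam1 >= 0 and lam2 >= 0 and lam1 * lam2 >= off_diag**2
```

To leading order $\lambda_1 \lambda_2 \approx n\kappa^5$ while the squared off-diagonal term is $\approx n^3 \kappa^8$, so the minor fails once $\kappa^3 \gtrsim n^{-2}$.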

In fact, we identify a specific subspace $W$ that is problematic for the [DM15a] solution. To
describe the subspace, let us fix some notation. Define the random variable $A_{ij}$ to be $-1$ if $(i,j) \notin E$,
and $+1$ otherwise. We follow the convention that $A_{ii} = 0$.

Lemma 2.5. Let the vectors $a_1, \ldots, a_n \in \mathbb{R}^{\binom{n}{2}}$ be defined so that $a_i(\{k,\ell\}) = A_{ik} A_{i\ell}$, and let
$W \stackrel{\mathrm{def}}{=} \mathrm{span}\{a_1, \ldots, a_n\}$. Then with probability at least $1 - O(n^{-\ldots})$,

$$\|\Pi_2 Q - \Pi_2 \Pi_W Q\| \lesssim \alpha_4 \tilde{n}.$$

Proof. This is an immediate observation from the various matrix norm bounds in [DM15a]
(specifically Lemma A.2, Lemma A.3 and Observation A.5). We defer the detailed proof to
Appendix A.1. □

Since $\|\Pi_2 Q \Pi_1\| \gg \alpha_4 \tilde{n}$, the above lemma implies that all the vectors with large singular values
for $Q$ are within the subspace $W$. Furthermore, we will show the following lemma which clearly
articulates that $W$ is the sole obstruction to $H_{22} \succeq 0$.

²Here, we have identified the matrices $\mathbb{E}[H_{22}]$ and $\mathbb{E}[H_{22}] - H_{22}$, which are matrices in $\mathbb{R}^{\binom{n}{2} \times \binom{n}{2}}$, with the $3 \times 3$
matrices corresponding to diagonalizing $\mathbb{E}[H_{22}]$ according to the three eigenspaces $V_0, V_1, V_2$ of the expectation $\mathbb{E}[H_{22}]$.
This is analogous to decomposing any quadratic form $v^\top H_{22} v$ into $v^\top(\Pi_0 + \Pi_1 + \Pi_2) H_{22} (\Pi_0 + \Pi_1 + \Pi_2) v$.
Lemma 2.6. Suppose $\alpha \in \mathbb{R}^4_+$ satisfies


min(Ao(a), Ai(a), A 2 (a)) > , (2-11) 

Ao(a) > Ai(a) > , (2-12) 

■^ 2 ( 0 ) ^ Oi^n (2-13) 

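then with probability $1 - O(n^{-\ldots})$,

$$H_{22} \succeq \frac{1}{4}\mathbb{E}[H_{22}] - \frac{16\|Q\|^2}{\lambda_1} \cdot \Pi_2 \Pi_W \Pi_2.$$

Proof. Fix $\theta = \frac{16\|Q\|^2}{\lambda_1}$. Recall that $H_{22} - \mathbb{E}[H_{22}] = Q + K$. We can write the matrix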

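$$H_{22} + \theta \cdot \Pi_2 \Pi_W \Pi_2 = B_{W^\perp} + B_W + B_K + \frac{1}{4}\mathbb{E}[H_{22}],$$

where

$$B_{W^\perp} = \frac{1}{4}\mathbb{E}[H_{22}] + \begin{bmatrix} \Pi_0 Q \Pi_0 & \Pi_0 Q \Pi_1 & \Pi_0 Q (I - \Pi_W) \Pi_2 \\ \Pi_1 Q \Pi_0 & \Pi_1 Q \Pi_1 & \Pi_1 Q (I - \Pi_W) \Pi_2 \\ \Pi_2 (I - \Pi_W) Q \Pi_0 & \Pi_2 (I - \Pi_W) Q \Pi_1 & \Pi_2 Q \Pi_2 \end{bmatrix}$$

and

$$B_W = \frac{1}{4}\mathbb{E}[H_{22}] + \begin{bmatrix} 0 & 0 & \Pi_0 Q \Pi_W \Pi_2 \\ 0 & 0 & \Pi_1 Q \Pi_W \Pi_2 \\ \Pi_2 \Pi_W Q \Pi_0 & \Pi_2 \Pi_W Q \Pi_1 & \theta \cdot \Pi_2 \Pi_W \Pi_2 \end{bmatrix}$$

and $B_K = K + \frac{1}{4}\mathbb{E}[H_{22}]$.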

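It is sufficient to show that $B_{W^\perp}$, $B_W$ and $B_K \succeq 0$. Using Proposition 2.3 and (2.9), $B_K \succeq
(\frac{\lambda_0}{4} - \alpha_3 n^{1/2})\Pi_0 + (\frac{\lambda_1}{4} - \alpha_3 n^{1/2})\Pi_1 + (\frac{\lambda_2}{4} - \alpha_3 n^{1/2})\Pi_2 \succeq 0$ when condition (2.11) holds. Using
Proposition 2.3, Lemma 2.4 and Lemma 2.5 we can write,³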


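$$B_{W^\perp} \succeq \frac{1}{4}\begin{bmatrix} \lambda_0 & 0 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{bmatrix} - \alpha_4 \cdot \begin{bmatrix} \tilde{n}^{3/2} & \tilde{n}^{3/2} & \tilde{n} \\ \tilde{n}^{3/2} & \tilde{n}^{3/2} & \tilde{n} \\ \tilde{n} & \tilde{n} & \tilde{n} \end{bmatrix},$$

which is PSD given the bounds on $\lambda_0, \lambda_1, \lambda_2$ in conditions (2.12) and (2.13). To see this, one shows
that all the $2 \times 2$ principal minors are PSD.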

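On the other hand, for any $x \in \mathbb{R}^{\binom{n}{2}}$, we can write

$$\begin{aligned} x^\top B_W x \ \ge\ & \frac{\lambda_0}{4}\|\Pi_0 x\|^2 + \frac{\theta}{2}\|\Pi_W \Pi_2 x\|^2 - 2\|Q\|\|\Pi_W \Pi_2 x\|\|\Pi_0 x\| \\ +\ & \frac{\lambda_1}{4}\|\Pi_1 x\|^2 + \frac{\theta}{2}\|\Pi_W \Pi_2 x\|^2 - 2\|Q\|\|\Pi_W \Pi_2 x\|\|\Pi_1 x\|. \end{aligned}$$

Now we will appeal to the fact that a quadratic $r(p,q) = ap^2 + 2bpq + cq^2 \ge 0$ for all $p, q \in \mathbb{R}$ if
$b^2 \le ac$ and $a > 0$. Since $\theta\lambda_1, \theta\lambda_0 \ge 16\|Q\|^2$ by condition (2.12), it is easily seen that the above
quadratic form is always non-negative, implying that $B_W \succeq 0$. □

³Here again we diagonalize according to the subspaces $V_0, V_1, V_2$, as in (2.10).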













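An immediate corollary of the proof of the above lemma is the following.
Corollary 2.7. Under the hypothesis of Lemma 2.6, with probability $1 - O(n^{-\ldots})$,

$$H_{22} - K \succeq \frac{1}{2}\mathbb{E}[H_{22}] - \theta \cdot \Pi_2 \Pi_W \Pi_2,$$

where $\theta$ is as fixed in the proof of Lemma 2.6.

The above corollary is a consequence of the fact that $H_{22} - K = B_{W^\perp} + B_W + \frac{1}{2}\mathbb{E}[H_{22}] - \theta \cdot \Pi_2 \Pi_W \Pi_2$.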

2.4 The Corrected Witness 


Suppose we have an unconstrained matrix $M$ that we wish to modify as little as possible so as to
ensure $M \succeq 0$. Given a test vector $w$ so that $w^\top M w < 0$, the natural update to make is to take
$M' = M + \beta \cdot ww^\top$ for a suitably chosen $\beta$. This would suggest creating a new SDP solution by
setting $H'_{22} = H_{22} + \beta \sum_{i \in [n]} a_i a_i^\top$.

Unfortunately, the SOS SDP relaxation has certain hard constraints, namely that the non-clique
entries are fixed at zero. Moreover, the entry $X_{S_1,S_2}$ must depend only on $S_1 \cup S_2$. Setting the SDP
solution matrix to $H_{22} + \beta \sum_{i \in [n]} a_i a_i^\top$ would almost certainly violate both these constraints. It is
thus natural to consider multiplicative updates to the entries of the matrix, which clearly preserve
the zero entries of the matrix.


Specifically, the idea would be to consider an update of the form $M' = M + \beta D_w M D_w$ where
$D_w$ is the diagonal matrix with entries given by the vector $w$. If the matrix $M$ has a significantly
large eigenvalue along $1$, i.e., $M = \lambda_0 \cdot 1 1^\top + \Delta$, for some matrix $\Delta$ with $\|\Delta\| \ll \lambda_0$, then this
multiplicative update has a similar effect as an additive update,

$$M' = M + \beta \cdot \lambda_0 \cdot ww^\top + \beta D_w \Delta D_w,$$

where the norm of the final "error" term $\beta D_w \Delta D_w$ is relatively small. Recall that, in our setting,
the Deshpande-Montanari SDP solution matrix $N$ does have a large eigenvalue along $1$. We now
formally describe our SDP solution, first as a matrix according to the intuition given above, and
then as a set of pseudomoments.
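The identity behind the multiplicative update is elementary and can be checked on a toy matrix. The sketch below is illustrative only: it uses plain nested lists, the hypothetical helper names `diag_conjugate` and `multiplicative_update`, and the unnormalized decomposition $M = \lambda_0 \cdot 1 1^\top + \Delta$.

```python
def diag_conjugate(M, w):
    """Return D_w M D_w, i.e. entry (i, j) of M scaled by w[i] * w[j]."""
    n = len(w)
    return [[w[i] * M[i][j] * w[j] for j in range(n)] for i in range(n)]

def multiplicative_update(M, w, beta):
    """Return M' = M + beta * D_w M D_w."""
    n = len(w)
    DMD = diag_conjugate(M, w)
    return [[M[i][j] + beta * DMD[i][j] for j in range(n)] for i in range(n)]
```

Because every entry of $M'$ is a multiple of the corresponding entry of $M$, zero entries stay zero, while the rank-one part $\lambda_0 \cdot 1 1^\top$ is mapped to $\lambda_0 \cdot ww^\top$.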

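Definition 2.8 (Corrected SDP Witness, matrix view). Let $a_1, \ldots, a_n \in \mathbb{R}^{\binom{n}{\le 2}}$ be defined so that

$$a_i(A) = \begin{cases} A_{ik} A_{i\ell} & A = \{k, \ell\}, \\ 0 & |A| < 2. \end{cases}$$

Define $D_i \in \mathbb{R}^{\binom{n}{\le 2} \times \binom{n}{\le 2}}$ to be the diagonal matrix with $a_i$ on the diagonal. Define $K$ to be the restriction
of $N(G,\alpha)$ to the non-multilinear entries. Also let

$$N'(G,\alpha) = N(G,\alpha) + \beta \cdot \sum_{i \in [n]} D_i (N(G,\alpha) - K) D_i,$$

where $\beta = \frac{1}{100\sqrt{n \log n}}$. Then our SDP witness is the matrix $M'$, defined so that

$$M' = P\, N'(G,\alpha)\, P,$$

where $P$ is the projection that zeros out rows and columns corresponding to pairs $(i,j) \notin E$.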
Definition 2.9 (Corrected SDP Witness, pseudomoments view). Let $\beta = \frac{1}{100\sqrt{n \log n}}$, and let
$\alpha \in \mathbb{R}^4_+$ be a set of parameters, to be fixed later. For a subset $S \subseteq [n]$, let $G[S]$ be the graph
induced on $G$ by $S$. For any subset of at most 4 vertices $S \subseteq [n]$, $|S| \le 4$, we define


E[X5](G,a) 


'C4(a) • ( 4 )/(2) + /5 - to S in G} ^ 4 ^ 

< C|S|(a) • (|S|)/(| 5 |) I'S’I < 3 and G[S] = clique, 

_ 0 otherwise, 


where $C_{|S|}(\alpha)$ is some factor chosen for each $|S| \in \{0, \ldots, 4\}$ depending on the choice of $\alpha$, which
we will set later to ensure that the final moments matrix is PSD.

Proposition 2.10. For $\beta = \frac{1}{100\sqrt{n \log n}}$, $\alpha_4 \le \frac{1}{2}$, with probability at least $1 - O(n^{-\ldots})$, the
solution $N'(G,\alpha)$ does not violate any of the linear constraints of the planted clique SDP.

Proof. First, $M'(S_1, S_2) = M(S_1, S_2)$ whenever $|S_1 \cup S_2| < 4$, so these entries satisfy the constraints
of the SDP. If $|S_1 \cup S_2| = 4$ then $M'(S_1, S_2)$ is given by

$$M'(S_1, S_2) = \alpha_4 \cdot \mathbb{I}[S_1 \cup S_2 \text{ is a clique}] \cdot \Big(1 + \beta \sum_{i \in [n]} \prod_{j \in S_1 \cup S_2} A_{ij}\Big).$$

Notice that $M'(S_1, S_2)$ is non-zero only if $S_1 \cup S_2$ is a clique, and it depends only on $S_1 \cup S_2$.
Moreover, $\sum_{i \in [n]} \prod_{j \in S_1 \cup S_2} A_{ij}$ is a sum over iid mean 0 random variables and therefore satisfies

$$\Pr\Big[\Big|\sum_{i \in [n]} \prod_{j \in S_1 \cup S_2} A_{ij}\Big| \le 100\sqrt{n \log n}\Big] \ge 1 - \ldots$$

A simple union bound over all subsets $S_1 \cup S_2 \in \binom{n}{4}$ shows that $M'(S_1, S_2) \in [0,1]$ for all of them
with probability at least $1 - O(n^{-\ldots})$. □
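The concentration step can be illustrated by simulation. The sketch below samples only the entries $A_{ij}$ with $j \in S$ for a fixed 4-set $S$, and compares the correction sum with the threshold $100\sqrt{n \log n}$; the function name `correction_sum`, the natural logarithm, and the sample value of $\alpha_4$ are assumptions. Note that the threshold is below the trivial bound $n$ only for fairly large $n$.

```python
import math
import random

def correction_sum(n, S, rng):
    """Sum over i in [n] of prod_{j in S} A_ij for a fresh G(n, 1/2) sample.

    For i outside S the entries A_ij (j in S) are independent uniform +-1;
    for i in S the product contains A_ii = 0 and contributes nothing.
    """
    total = 0
    for i in range(n):
        if i in S:
            continue
        prod = 1
        for _ in S:
            prod *= rng.choice((-1, 1))
        total += prod
    return total

def corrected_entry(alpha4, beta, s):
    """Value alpha_4 * (1 + beta * s) assigned to a 4-clique with correction sum s."""
    return alpha4 * (1 + beta * s)
```

With $|\beta \cdot s| \le 1$ and $\alpha_4 \le \frac{1}{2}$, the corrected entry stays inside $[0,1]$, matching the argument in the proof.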


It now remains to verify that $N'(G,\alpha) \succeq 0$. We will do this by verifying the Schur complement
conditions, as in [DM15a]. Analogous to the submatrix $H_{22}$, one can consider the corresponding
submatrix $H'_{22}$. The expression for $H'_{22}$ is as follows:

$$H'_{22} \stackrel{\mathrm{def}}{=} H_{22} + \beta \sum_{i \in [n]} D_i (H_{22} + \alpha_2^2 J - K) D_i.$$

Here $D_i$ is the matrix with $a_i$ on the diagonal, $K$ is the matrix corresponding to the
non-multilinear entries (entries corresponding to monomials like $x_a^2 x_b x_c$), and $J$ is the all-1s matrix.

The matrices $H_{12}$ and $H_{11}$ are unchanged, and so we must simply verify that $H'_{22} \succeq H_{12}^\top H_{11}^{-1} H_{12}$
and that $H'_{22} \succeq 0$.

This concludes our proof overview. In Section 3, we verify the Schur complement conditions
and prove our main result, and in Section 4 we give the random matrix concentration results upon
which we rely throughout the proof.


3 Proof of the Main Result 

In this section, we will demonstrate that $H'_{22} \succeq 0$, and that $H'_{22} \succeq H_{12}^\top H_{11}^{-1} H_{12}$. This will allow us
to conclude that our solution matrix is PSD, and therefore is a feasible point for the degree-4 SOS
relaxation.
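Parameters. Before we proceed further, it will be convenient to parametrize the choice of
$\alpha_1, \alpha_2, \alpha_3$ and $\alpha_4$. In particular, it will be useful to fix

$$\alpha_1 \stackrel{\mathrm{def}}{=} \frac{\rho}{n^{1/2}}, \qquad \alpha_2 \stackrel{\mathrm{def}}{=} \frac{\gamma\rho^2}{n}, \qquad \alpha_3 \stackrel{\mathrm{def}}{=} \frac{\gamma^3\rho^3}{n^{3/2}}, \qquad \alpha_4 \stackrel{\mathrm{def}}{=} \frac{\gamma^6\rho^4}{n^2}, \qquad (3.1)$$

for two parameters $\gamma, \rho$, which we will finally fix to $\gamma = \log^{\ldots} n$ and $\rho = \log^{\ldots} n$. For this setting of
parameters, the eigenvalues $\lambda_0, \lambda_1, \lambda_2$ from Proposition 2.3 are bounded by,

$$\lambda_0 \ge \frac{\alpha_4 n^2}{64} = \frac{\gamma^6\rho^4}{64}, \qquad \lambda_1 \ge \frac{\alpha_3 n}{4} = \frac{\gamma^3\rho^3}{4n^{1/2}}, \qquad \lambda_2 \ge \frac{\alpha_2}{2} = \frac{\gamma\rho^2}{2n}. \qquad (3.2)$$

When convenient, we will also use the shorthand $c_1 \stackrel{\mathrm{def}}{=} n^{1/2}\alpha_1$, $c_2 \stackrel{\mathrm{def}}{=} n\alpha_2$, $c_3 \stackrel{\mathrm{def}}{=} n^{3/2}\alpha_3$ and
$c_4 \stackrel{\mathrm{def}}{=} n^2\alpha_4$.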
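As a quick numerical sanity check of how the parametrization interacts with Proposition 2.3, the sketch below evaluates the eigenvalues and the lower bounds of (3.2) in floating point. The parametrization follows the reading of (3.1) given above; the function names and the sample values $\gamma = \log n$, $\rho = 1/\log n$ are placeholders chosen for illustration, not the values fixed in the paper.

```python
import math

def alphas(n, gamma, rho):
    """(alpha_1, ..., alpha_4) under the parametrization read from (3.1)."""
    return (rho / n**0.5,
            gamma * rho**2 / n,
            gamma**3 * rho**3 / n**1.5,
            gamma**6 * rho**4 / n**2)

def eigenvalues(n, a2, a3, a4):
    """lambda_0, lambda_1, lambda_2 from Proposition 2.3."""
    lam0 = a2 + (n - 2) * a3 + (n - 2) * (n - 3) / 32 * a4 - n * (n - 1) / 2 * a2**2
    lam1 = a2 + (n - 4) / 2 * a3 - (n - 3) / 16 * a4
    lam2 = a2 - a3 + a4 / 16
    return lam0, lam1, lam2

def lower_bounds(n, gamma, rho):
    """Right-hand sides of (3.2)."""
    return (gamma**6 * rho**4 / 64,
            gamma**3 * rho**3 / (4 * n**0.5),
            gamma * rho**2 / (2 * n))
```

The bounds rely on $\gamma$ being large enough that the $\alpha_4$ term dominates $\lambda_0$, and on $n$ being large enough that $\gamma^3\rho \ll \sqrt{n}$.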


3.1 Proving that $H'_{22} \succeq 0$

Here we will make a first step towards verifying the Schur complement conditions of Lemma 2.2 by
showing that $H'_{22} \succeq 0$. Specifically, we will show the following stronger claim.

Theorem 3.1. For (3 = 7 = log^ n, p < log“^° n, the following holds with probability 

at least 1 — 0 (n“^), 


H'22fTh22] + ^-TIw 

8 Id 

Proof. Fix 9 = By definition of H 22 , we have 

$$H'_{22} = H_{22} + \beta \sum_{i \in [n]} D_i (H_{22} + \alpha_2^2 J - K) D_i.$$

Define $P_W = \sum_{i \in [n]} a_i a_i^\top$. We can apply Lemma 2.6 to the $H_{22}$ term and Corollary 2.7 for $H_{22} - K$,

H'22 ^ \ E[i 722 ] -0 • n2nivn2 + /3 • a (\nH22] -eu2UwU2 + A- 

ie[n] ^ 2 

^ ^ie[L722] - 0 ■ n2niyn2 + ^ ^ a (yno - 6»n2nvvn2 j Di. (dropping ni,n2, J) 

^ i E[A2] - e ■ U2UWU2 + l 3 ^Pw -POY, DiIi2liwTl2Di (usingAAA = a^af/ ) 

je[n] ^ 2 


Now we will appeal to a few matrix concentration bounds that we show in Section 4. First,
with probability $1 - O(n^{-\ldots})$, the vectors $\{a_i\}$ are nearly orthogonal, and therefore form a
well-conditioned basis for the subspace $W$.


n 


Pw ^ • P-w- 


(see Lemma 4.3) . 


Also, the vectors $\{a_i\}$ have negligible projection on to the eigenspaces $V_0, V_1$, which implies that
with overwhelming probability,


||Fi 2 nu/n 2 — Hwll < 


log^ n 


n 


•Id, 


(see Lemma 4.5) . 
Finally, $W$ is an $n$ dimensional space. Each $D_i \Pi_2 \Pi_W \Pi_2 D_i$ has only $n$ non-zero singular values
each of which is $O(1)$. Moreover, multiplying on the left and right by $D_i$ acts as a random linear
transformation / random change of basis. Intuitively, this suggests that $\sum_i D_i \Pi_2 \Pi_W \Pi_2 D_i$ has $n^2$
eigenvalues all of which are roughly $O(1)$. In fact, with probability $1 - O(n^{-\ldots})$,

$$\sum_i D_i \Pi_2 \Pi_W \Pi_2 D_i \preceq O(n) \cdot \Pi_0 + \log^{\ldots} n \cdot \mathrm{Id} \qquad \text{(see Lemma 4.6)}.$$

Substituting these bounds we get, 

H'y ^E[H 22 ] + (^^ - 0^ -Uw- + f39 log^r^j ■ Id - (39 ■ 0{n) ■ Uq 


By Lemma 2.4, with probability at least $1 - O(n^{-\ldots})$, $\|Q\| \lesssim \alpha_4 n^{3/2}$. Substituting this bound for
$\theta = \frac{16\|Q\|^2}{\lambda_1}$, along with (3.2), finishes the proof for our choice of parameters. The details are presented
below for completeness.


9 ^ Y^logn 

/3Ao OL^n 


log"^ n • 7^/9 < 1 , 


Clearly Aq > Ai 


because 


I39n ^ 1 


.2;rr3 




v^logn a^n 

> A 2 and 

A2>lOO0(^/31og2n + i^^^ , 


< log^ n • < 1 


9(3 log^ n ^ a|n' 


A 2 


1 log^ n 
a^n y/n\ogn 02 


= log"* n • 7 ®p^ < 1 


□ 


3.2 Bounding singular values of $H_{12}$

Towards bounding the eigenvalues of $H_{21} H_{11}^{-1} H_{12}$, Deshpande and Montanari [DM15a] observe the
following properties of $H_{21}$ with regard to the spaces $V_0$, $V_1$, and $V_2$.

Lemma 3.2 (Consequence of Propositions 4.18 and 4.27 in [DMlSa]). Let Qn G be the 

orthogonal projector to the space spanned by 1„. Let p = ^. For the matrix H 21 , we have that for 
sufficiently large n, with probability 1 — 0 {n~^), 

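$$\|\mathbb{E}[H_{21}] - H_{21}\| \le \alpha_3 n,$$

and

$$\begin{aligned}
\|\Pi_0 \mathbb{E}[H_{21}] Q_n\| &\le 3 n^{3/2} \alpha_3 + 2 \alpha_2 n^{1/2}, & \Pi_0 \mathbb{E}[H_{21}] Q_n^{\perp} &= 0, \\
\Pi_1 \mathbb{E}[H_{21}] Q_n &= 0, & \|\Pi_1 \mathbb{E}[H_{21}] Q_n^{\perp}\| &\le \alpha_2 n^{1/2}, \\
\Pi_2 \mathbb{E}[H_{21}] Q_n &= 0, & \Pi_2 \mathbb{E}[H_{21}] Q_n^{\perp} &= 0.
\end{aligned}$$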


Unfortunately, the bound of [DM15a] on $\|\mathbb{E}[H_{21}] - H_{21}\|$ is insufficient for our purposes, and
we require a more fine-grained bound on the deviation from expectation. In fact, outside the
problematic subspace $W$, we show that $H_{21} - \mathbb{E}[H_{21}]$ is much better behaved.

Proposition 3.3. Let $W = \mathrm{span}_{i \in [n]}\{a_i\}$, and let $\Pi_W$ be the projector to that space. With high
probability, the following holds for every $x \in \mathbb{R}^{\binom{n}{2}}$:

\\x^U2 {H2 i - E[i/ 2 i])f < 

Proof. From Lemma 3.2, we have that

$$\Pi_2 \mathbb{E}[H_{21}] = 0.$$

Thus we may work exclusively with the difference from the expectation; for convenience, we let
$U = H_{21} - \mathbb{E}[H_{21}]$. Fix $A = \{a, b\}$ and $c$, so that $|\{a, b\}| = 2$. By inspection, the entry $(A, c)$ of $U$
is given by the polynomial

Ua,c = —^{AacAi^c + {Aac + ^6c))- 
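Thus, the columns of $U$ are in $W \cup V_1$. So we have that

$$\Pi_2 U = \Pi_2 \left( \Pi_W + \Pi_{(W \cup V_1) \setminus W} + \Pi_{(W \cup V_1)^{\perp}} \right) U = \Pi_2 \Pi_W U,$$

where the latter two terms were eliminated because $V_1 \perp V_2$ and $(W \cup V_1)^{\perp} \perp \mathrm{span}\{\mathrm{col}(U)\}$.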
In Lemma 4.5, we bound UnoiLlu/Lloill < 0( ^°^ So we have that 


|x^n2nrv||^ < x^(id — noi)nrv(id — noi)x 

= ||nryx||^ — 2x^noinrvx + x^noinrylloix 

< ||nwx||^ + 2||nrvx||||noinvi/a:|| + lla^lPlinoinrylloil 


< 


2 ||nvi/x||^ + 


2 log^ n 


n 


where we have applied the inequality $a^2 + b^2 \ge 2ab$. The conclusion follows by noting that
$\|x^\top \Pi_2 H_{21}\| = \|x^\top \Pi_2 \Pi_W U\| \le \|x^\top \Pi_2 \Pi_W\| \cdot \|U\|$, and that by Lemma 3.2 $\|U\| \le \alpha_3 n$. □


3.3 Bounding singular values of $H_{21} H_{11}^{-1} H_{12}$

We will bound $H_{21} H_{11}^{-1} H_{12}$, as in our bounds on $H_{22}$, by splitting the matrix up according to the
eigenspaces $\Pi_0, \Pi_1, \Pi_2$.


Theorem 3.4. Let ci,..., C 4 be as defined in (3.1). For the choice of a in (3.1), we have that with 
probability 1 — 0(n“®), 


^ ^ • Bo + 


3 

C2 


^2 I „2 

C 2 i- C 3 


log^ n 


cin 


1/2 


ni + 


calog^n cpog^n 
cin3/2 2 + ^^^ 1/2 


• Bn/ 


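Proof. For each $a, b \in \{0, 1, 2\}$, define the matrix $U_{ab} = \Pi_a (\mathbb{E}[H_{21}] - H_{21}) \Pi_b$. We can verify that
for our choice of $\alpha$ the conditions of Lemma A.7 are met, and so we conclude that with probability
$1 - O(n^{-5})$,

$$H_{11}^{-1} \preceq \frac{1}{n(\alpha_2 - 2\alpha_1^2)} \cdot Q_n + \frac{1}{\alpha_1} \cdot Q_n^{\perp}.$$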

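For $x \in \mathbb{R}^{\binom{n}{2}}$, let $y_U = (\mathbb{E}[H_{12}] - H_{12}) \Pi_2 x$, $z_U = (\mathbb{E}[H_{12}] - H_{12})(\Pi_0 + \Pi_1) x$, $x_A = \mathbb{E}[H_{12}] \Pi_0 x$,
$x_B = \mathbb{E}[H_{12}] \Pi_1 x$ and $x_C = \mathbb{E}[H_{12}] \Pi_2 x$. In order to simplify our computations, we will use the
following observation.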
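
Observation 3.5. Given $A \in \mathbb{R}^{n \times n}$ with $A \succeq 0$ and vectors $x_1, \ldots, x_t \in \mathbb{R}^n$,

$$\Big( \sum_{i \in [t]} x_i \Big)^{\top} A \Big( \sum_{i \in [t]} x_i \Big) \le t \cdot \sum_{i \in [t]} x_i^{\top} A x_i.$$

Proof. If we set $y_i = A^{1/2} x_i$ then the inequality reduces to

$$\Big\| \sum_{i \in [t]} y_i \Big\|^2 \le t \cdot \sum_{i \in [t]} \|y_i\|^2,$$

which is an immediate consequence of the triangle inequality for $\|\cdot\|$ and the Cauchy-Schwarz inequality. □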
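
As a quick sanity check of Observation 3.5, the following sketch evaluates both sides of the inequality on small random integer instances. The helper names (`quad_form`, `random_psd`, `check_observation`) and the test data are illustrative assumptions and do not come from the paper; the PSD matrix is generated as $B^\top B$ purely for convenience.

```python
import random

def mat_vec(A, x):
    """Return the matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def quad_form(A, x):
    """Return the quadratic form x^T A x."""
    return sum(xi * yi for xi, yi in zip(x, mat_vec(A, x)))

def random_psd(n, rng):
    """Return B^T B for a random integer matrix B (hypothetical test data); always PSD."""
    B = [[rng.randint(-3, 3) for _ in range(n)] for _ in range(n)]
    return [[sum(B[r][i] * B[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

def check_observation(n=4, t=5, seed=0):
    """Return (lhs, rhs) of the inequality for t random vectors in dimension n."""
    rng = random.Random(seed)
    A = random_psd(n, rng)
    xs = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(t)]
    total = [sum(x[i] for x in xs) for i in range(n)]
    lhs = quad_form(A, total)                    # (sum x_i)^T A (sum x_i)
    rhs = t * sum(quad_form(A, x) for x in xs)   # t * sum x_i^T A x_i
    return lhs, rhs

lhs, rhs = check_observation()
print(lhs <= rhs)
```

In the proof of Theorem 3.4 the observation is used with $t = 5$ and $A = H_{11}^{-1}$, which is where the factor 5 in the next display comes from.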


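Using Observation 3.5,

$$x^\top H_{12}^\top H_{11}^{-1} H_{12} x \le 5 \left( x_A^\top H_{11}^{-1} x_A + x_B^\top H_{11}^{-1} x_B + x_C^\top H_{11}^{-1} x_C + y_U^\top H_{11}^{-1} y_U + z_U^\top H_{11}^{-1} z_U \right).$$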

To simplify the calculations and make the dominant terms apparent, let us hx ai = cily/n, 
02 = C 2 /n, as = and 04 = 04/71 wherein each c* G [ 1 / log^*^® ri, 1 ]. First, observe that 


> 


for this setting of parameters. 


01 n{a2—2a'f) 

For the terms $x_A^\top H_{11}^{-1} x_A$, $x_B^\top H_{11}^{-1} x_B$ and $x_C^\top H_{11}^{-1} x_C$ we can write,


XaHuXa < ||nox||^ 


--• \\UoE[H2l]Qnf + — 

n[a 2 — 2 af) ai 


\UoE[H 2 i]Q. 


- L ||2 


(a 2 - 2 af) C 2 


x^FIii^xs < ||nix||' 


, \ 2 ^ • llni E[H 2 l]Qn\\^ + — 

n[a2 — 2af) ai 


|niE[F2i]Q, 


- L ||2 


<^l|nixf < 

ai 


r 2 


xl^ffn^xc < ||n 2 x||^ 


cin ^/2 

1 


n(a 2 — 2 a^) 


IlHixf 

• ||n2E[i/2l]Qnf + — • \\n2E[H2l]Q. 

CHI 


±||2 
n II 


= 0 

ll(no + ni)xf 


n(a2 — 2af) 


\H 21 - E[H2l]f\\Qnf + — • ||i^21 - IE[^2i]f IIQ: 


- L ||2 


ai 


< 


(linoxf + iinixf) 


2 ;rT 2 


OnU 


+ 


ain^ 


n{a2 — 2af) ai 


(linoxf + iinixf) 


2 \ 4 ™ 


cin 


1/2 


Finally, we have 

vuHf^^yu < 11(^12 -E[i/i2])n2x||2 


< 

rv-/ 


< 


2 tt 2 


aZn 


+ 


a\n^ 


n{a2 — 2af) ai 


1 1 

n{a2 — 2af) ai 
,, 27^2 


Ilhtwa^lP + 


log n agTT^ log n 


n‘^{oi 2 — 2 af) 


+ 


nai 


4^og^n iiTT ii2 , cilog^n 2 


The conclusion follows by grouping the projections, and taking the dominating terms as n grows. □ 

3.4 Proof of $H'_{22} \succeq H_{21} H_{11}^{-1} H_{12}$

Theorem 3.6. For the choice of $\alpha$ given in (3.1), we have that $H'_{22} \succeq H_{21} H_{11}^{-1} H_{12}$ with probability
$1 - O(n^{-4})$.

Proof. Recall that by Theorem 3.1 with probability at least 1 — 0{n~^), 

H!22h ^ -110 + ^ - ^ ■U2 + ^Uw . 

By our choice of the parameters $\alpha_1, \alpha_2, \alpha_3, \alpha_4$, the conclusion of Theorem 3.4 implies that


W 22 - 


‘Hi2 t ( Ro - ^ 


+ 


C 2 

1 a 2 - 


Bo + 

C3 log^ n 


1 _ cl + cl log^ n 

4 ^ cin^/^ 


Bi 


cin 


3/2 


B 2 + 


1 

-/3Ao - 
16 


log^ n 


cin 


^ 0 , 


Bvr 


as desired. The details of the calculation are spelled out below for the sake of completeness; we 
verify that the coefficient of each projector is non-negative. 

For the space Bq, 


For Hi, 


For 112 , 


Finally for Iliy, 


c 


2 

3 


C2 


1 

Ao 


^ = 7 -' « 1 

C2C4 


cj + cj log^ n _ J_ ^ cl + cl log^ n 
cin^/^ Ai C3C1 


- -I- 7 ^/?^ log^ n <C 1. 

7 


C 3 log"^ n _ J_ ^ 1 C 3 log^ n 

c\n^l'^ A2 ~ C1C2 


C 3 log^ n 1 
cin^/2 /3Ao 


c| log^ n 

C 1 C 4 


p log^ n <C 1 . 


This concludes the proof. 


□ 


3.5 Proof of Main Theorem 

We finally have the components needed to prove Theorem 2.1. 

Proof of Theorem 2.1. First, we recall that independent of our choice of $\alpha$, the SDP solution defined
in Definition 2.8 does not violate any of the linear constraints of (2.1), as shown in Proposition 2.10.
To meet the program constraints, it remains to show that for the choice of $\alpha$ given in (3.1), our
solution is PSD.

The solution matrix $M'(G,\alpha)$ is a principal submatrix of $N'(G,\alpha)$, and so $N' \succeq 0$ implies
$M' \succeq 0$. We prove that $N'$ satisfies the Schur complement conditions from Lemma 2.2 with high
probability. Observing that $H'_{11} = H_{11}$ and $H'_{12} = H_{12}$, we apply Theorem 3.6, which states that
for our choice of $\alpha$, $H'_{22} \succeq H_{21} H_{11}^{-1} H_{12}$. For the $H_{11}$ term, we apply the lower bound on the

eigenvalues of i^n given by [DM15a] (Lemma A. 7), which state that so long as ai — a 2 3> a 2 n“^/^ 
and a 2 — 2af > 0, we have Hu y 0 with probability 1 — 0(n~^). For our choice of ai, a 2 , we have 


^ _P _ 7 ^^ 

ni/2 „ 3/2 ^ ^ 1/2 n 


and 


o 2 IP - 2/5 _ n 

a2 — 2ai =-i:|> 0, 

n 


and so we may conclude that Hu ^ 0. Therefore by a union bound, the conditions of Lemma 2.2 
are satisfied with probability 1 —0(n“^), and N'{G,a) ^ 0 and so our solution satisfies the PSDness 
constraint. 

It remains only to prove that the objective value is Q{y/n). The objective value is simply 
Z]ie[n] ~ ~ concluding our proof. □ 
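
The proof above reduces PSDness of the full matrix to conditions on its blocks. Lemma 2.2 is not reproduced in this section, so the sketch below assumes the standard Schur complement criterion: for a symmetric block matrix with upper-left block $A \succ 0$, the matrix is PSD exactly when $C - B^\top A^{-1} B \succeq 0$. The $3 \times 3$ matrices and all function names are hypothetical illustrations, with a $1 \times 1$ upper-left block to keep the arithmetic exact.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def is_psd(M):
    """A symmetric matrix is PSD iff all of its principal minors are nonnegative."""
    n = len(M)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            sub = [[M[i][j] for j in idx] for i in idx]
            if det(sub) < 0:
                return False
    return True

def schur_complement_1x1(M):
    """Return C - B^T A^{-1} B where A = M[0][0] is the (positive) 1x1 leading block."""
    a = Fraction(M[0][0])
    n = len(M)
    return [[Fraction(M[i][j]) - Fraction(M[i][0]) * Fraction(M[0][j]) / a
             for j in range(1, n)] for i in range(1, n)]

N_good = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]  # PSD example
N_bad = [[1, 2, 0], [2, 1, 0], [0, 0, 1]]   # has a negative eigenvalue

for N in (N_good, N_bad):
    # Both checks agree because the leading block is strictly positive.
    print(is_psd(N), is_psd(schur_complement_1x1(N)))
```

In the paper's setting the role of $A$ is played by $H_{11}$ and the role of the Schur complement by $H'_{22} - H_{21} H_{11}^{-1} H_{12}$.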


4 Concentration of Projected Matrices 

In this section, we give bounds on the spectra of random matrices that are part of the correction
term. Though we are able to recycle many of the spectral bounds of Deshpande and Montanari
[DM15a], in our modification to their witness, we introduce new matrices which also require
description and norm bounds.

We obtain our bounds by employing the trace power method. The trace power method uses the
fact that if $X \in \mathbb{R}^{n \times n}$ is a symmetric matrix, then for even powers $k$, $\mathrm{Tr}(X^k) = \sum_i \lambda_i(X)^k \ge \lambda_{\max}(X)^k$.
By bounding $\mathbb{E}[\mathrm{Tr}(X^k)]^{1/k}$ for a sufficiently large $k$, we essentially obtain bounds on
the infinity norm of the vector of eigenvalues, i.e., a bound on the spectral norm of the matrix $X$.
A formal statement follows, and the proof is given in Appendix A.1 for completeness.

Lemma 4.1. Suppose an $n \times n$ random matrix $M$ satisfies $\mathbb{E}[\mathrm{Tr}(M^k)] \le n^{\alpha k + \beta} \cdot (\gamma k)!$ for any even
integer $k$, where $\alpha, \beta, \gamma$ are constants. Then

• n" • logre^ >! — ??• 
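
The deterministic inequality behind the trace power method, $\lambda_{\max}(X)^k \le \mathrm{Tr}(X^k) \le n \cdot \|X\|^k$ for even $k$, can be seen on a matrix with known spectrum. The sketch below is an illustration under assumptions of our own: the test matrix $J - \mathrm{Id}$ on 5 vertices (eigenvalues 4 and $-1$) and the function names are not taken from the paper.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][r] * Y[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

def trace_of_power(X, k):
    """Return Tr(X^k) by repeated multiplication (exact for integer entries)."""
    n = len(X)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = mat_mul(P, X)
    return sum(P[i][i] for i in range(n))

def trace_power_estimate(X, k):
    """Return Tr(X^k)^(1/k), an upper bound on the spectral norm when k is even."""
    assert k % 2 == 0
    return trace_of_power(X, k) ** (1.0 / k)

# X = J - Id on n = 5 vertices: eigenvalue n - 1 = 4 once, eigenvalue -1 four times.
n = 5
X = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
estimates = {k: trace_power_estimate(X, k) for k in (2, 4, 8, 16)}
for k, value in estimates.items():
    print(k, value)  # approaches the spectral norm 4 as k grows
```

The estimate overshoots the spectral norm by a factor of at most $n^{1/k}$, which is why taking $k$ on the order of $\log n$ loses only a constant factor.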

Our concentration proofs consist of, for each matrix in question, obtaining a bound on $\mathbb{E}[\mathrm{Tr}(X^k)]$.
The expression $\mathbb{E}[\mathrm{Tr}(X^k)]$ is a sum over products along closed paths of length $k$ in the entries of $X$.
In our case, the entries of the random matrix $X$ are themselves low degree polynomials in random
variables $\{A_{ij}\}_{i \in [n], j \in [n]}$ where $A_{ij}$ is the centered random variable that indicates whether the edge
$(i,j)$ is part of the random graph $G$. Thus $\mathrm{Tr}(X^k)$ can be written out as a polynomial in the random
variables $\{A_{ij}\}$. Since the random variables $\{A_{ij}\}$ are centered (i.e., $\mathbb{E}[A_{ij}] = 0$), almost
all of the terms in $\mathbb{E}[\mathrm{Tr}(X^k)]$ vanish. The nonzero terms are precisely those monomials in
which every variable appears with even multiplicity.
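
The even-multiplicity rule can be checked exactly by enumerating sign patterns. The sketch below assumes each edge variable is an independent uniform $\pm 1$ variable, as for $G(n, \tfrac12)$; the function name and the encoding of a monomial as a list of index pairs are illustrative choices, not notation from the paper.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def monomial_expectation(monomial):
    """Exact expectation of a product of independent uniform +/-1 variables.

    `monomial` lists the variables (here: index pairs), repeated by multiplicity.
    """
    variables = sorted(set(monomial))
    mult = Counter(monomial)
    total = Fraction(0)
    # Average the monomial over all 2^m sign assignments to its m distinct variables.
    for signs in product((-1, 1), repeat=len(variables)):
        term = 1
        for var, s in zip(variables, signs):
            term *= s ** mult[var]
        total += term
    return total / 2 ** len(variables)

even = [(1, 2), (1, 2), (1, 3), (1, 3)]   # A_12^2 * A_13^2
odd = [(1, 2), (1, 2), (1, 3)]            # A_12^2 * A_13
print(monomial_expectation(even), monomial_expectation(odd))
```

A monomial survives in expectation only when every exponent is even, which is the property that the labelled-graph counting below exploits.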

For the purpose of moment calculations, we borrow much of our terminology from the work of
Deshpande and Montanari [DM15a]. Every monomial in the random variables $\{A_{ij}\}$ corresponds
to a labelled graph $(F = (V,E), \mathcal{L})$ that consists of a graph $F = (V, E)$ and a labelling $\mathcal{L} : V \to [n]$
that maps its vertices to $[n]$. A labelling of $F$ contributes (is nonzero in expectation) if and only
if every pair $\{i,j\}$ appears an even number of times as a label of an edge in $F$. The problem of
bounding $\mathbb{E}[\mathrm{Tr}(X^k)]$ reduces to counting the number of contributing labelled graphs.

For example, for a given matrix $X$, we may have a bound on the number of “vertices” and “edges”
in a term of $\mathbb{E}[\mathrm{Tr}(X^k)]$ as a function of $k$. In that case, we may use the following proposition, which
allows us to bound the number of such graphs in which every variable Aij, corresponding to an 
edge between vertices i and j, appears at least twice. 

Proposition 4.2. Let $F = (V, E)$ be a multigraph and let $\mathcal{L} : V \to [n]$ be a labelling such that each
pair $(i,j)$ appears an even number of times as the label of an edge in $E$. Then,

$$|\{\mathcal{L}(v) \mid v \in V\}| \le \frac{|E|}{2} + (\#\text{ connected components of } F).$$


Proof. From $F$, we form a new graph $F'$ by identifying all the nodes with the same label; thus, the
number of nodes in $F'$ is the number of labels in $F$. We then collapse the parallel edges in $F'$ to
form the graph $H$; since each labelled edge appears at least twice, the number of edges in $H$ is at
most half that in $F$. The number of nodes in $H$ (and thus labels in $F$) is at most the number of
edges in $H$ plus the number of connected components; this is tight when $H$ is a forest. Since
identifying nodes cannot increase the number of components, the
number of distinct labels in $F$ is at most $|E|/2 + c$, where $c$ is the number of components in $F$. □


We apply this proposition, as well as simple inductive arguments, to bound the number of contributing
terms in $\mathbb{E}[\mathrm{Tr}(X^k)]$ for the matrices $X$ in question, and this allows us to bound their norms. We
give the concentration proofs in the following subsection.


4.1 Proofs of Concentration 

Let $G$ be an instance of $G(n, \tfrac{1}{2})$. As in the preceding sections, define the vector $A_i \in \mathbb{R}^n$ so
that $A_i(j) = 1$ if $(i,j) \in E(G)$, $A_i(j) = -1$ if $(i,j) \notin E(G)$, and $A_i(i) = 0$. Again as in the
preceding sections, define $a_1, \ldots, a_n \in \mathbb{R}^{\binom{n}{2}}$ by setting $a_i$ to be the restriction of $A_i^{\otimes 2}$ to coordinates
corresponding to unordered pairs, i.e., $a_i(\{c,d\}) = A_{ic} A_{id}$ for all $\{c,d\} \in \binom{[n]}{2}$. We will continue to
use the notation $W = \mathrm{span}_{i \in [n]}(a_i)$, and the notation $D_i = \mathrm{diag}(a_i)$.

We begin with a lemma that shows that the ai are close to an orthogonal basis for W. 

Lemma 4.3. If Pw '=^ Yli then with probability at least 1 — 0(ji~^), 

■^{l + 0(1)) • Pw F Hw ^ (1 - 0(1));^ • Pw 

Proof. By definition, the vectors $a_1, \ldots, a_n$ form a basis for the subspace $W$.

Let $\mathcal{R}$ be the matrix whose $i$th row is $a_i$. We will use matrix concentration to analyze the
eigenvalues of $\mathcal{R}\mathcal{R}^\top$, which are identical to the nonzero eigenvalues of $P_W = \mathcal{R}^\top \mathcal{R}$.
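
The claim that $\mathcal{R}\mathcal{R}^\top$ and $\mathcal{R}^\top\mathcal{R}$ share their nonzero eigenvalues can be illustrated without an eigensolver: two symmetric matrices with the same nonzero spectrum have equal power traces $\mathrm{Tr}(S^k)$ for every $k \ge 1$. The sketch below builds the rows $a_i$ from a small random sign matrix and compares the first few power traces. The size $n = 5$, the random seed, and the helper names are assumptions made for illustration.

```python
import random
from itertools import combinations

def sample_signs(n, rng):
    """Symmetric +/-1 matrix with zero diagonal: A[i][j] = 1 iff {i, j} is an edge."""
    A = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        A[i][j] = A[j][i] = rng.choice((-1, 1))
    return A

def build_rows(A):
    """Row i is a_i, indexed by unordered pairs: a_i({c, d}) = A[i][c] * A[i][d]."""
    n = len(A)
    pairs = list(combinations(range(n), 2))
    return [[A[i][c] * A[i][d] for c, d in pairs] for i in range(n)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def mat_mul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def power_traces(S, max_k):
    """Return [Tr(S), Tr(S^2), ..., Tr(S^max_k)] for a square integer matrix S."""
    traces, P = [], S
    for _ in range(max_k):
        traces.append(sum(P[i][i] for i in range(len(S))))
        P = mat_mul(P, S)
    return traces

rng = random.Random(1)
R = build_rows(sample_signs(5, rng))
small = mat_mul(R, transpose(R))   # the n x n Gram matrix R R^T
large = mat_mul(transpose(R), R)   # the C(n,2) x C(n,2) matrix R^T R
print(power_traces(small, 4) == power_traces(large, 4))
```

Working with the $n \times n$ Gram matrix rather than the $\binom{n}{2} \times \binom{n}{2}$ matrix is what makes the trace computation in the rest of the proof tractable.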

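The $(i,j)$th entry of $\mathcal{R}\mathcal{R}^\top$ is $\langle a_i, a_j \rangle = \tfrac{1}{2} \langle A_i^{\otimes 2}, A_j^{\otimes 2} \rangle = \tfrac{1}{2} \langle A_i, A_j \rangle^2$. When $i = j$, this is precisely
$\tfrac{1}{2}(n-1)^2$, and so $2\mathcal{R}\mathcal{R}^\top = (n-1)^2 \cdot \mathrm{Id}_n + B$, where $B$ is a matrix that is 0 on the diagonal and
equal to $\langle A_i^{\otimes 2}, A_j^{\otimes 2} \rangle$ in the $(i,j)$th entry for $i \ne j$.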
Let $M = B - \mathbb{E}[B] = B - (n-2)(J_n - \mathrm{Id}_n)$. We will use the trace power method to prove that
$\|M\| = O(n^{3/2})$. The $(i,j)$th entry of $M$ is given by 0 for $i = j$, and when $i \ne j$

$$M(i,j) = \langle A_i, A_j \rangle^2 - (n-2) = \Big( \sum_{p,q} A_{ip} A_{iq} A_{jp} A_{jq} \Big) - (n-2) = \sum_{p \ne q} A_{ip} A_{iq} A_{jp} A_{jq}.$$
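
The identity for $M(i,j)$ holds because the diagonal terms $p = q$ of the double sum each equal $A_{ip}^2 A_{jp}^2$, which is 1 for the $n - 2$ indices $p \notin \{i, j\}$ and 0 otherwise. The sketch below checks this on a random sign matrix; the size, seed, and function names are illustrative assumptions.

```python
import random
from itertools import combinations

def sample_signs(n, rng):
    """Symmetric +/-1 matrix with zero diagonal, standing in for the vectors A_i."""
    A = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        A[i][j] = A[j][i] = rng.choice((-1, 1))
    return A

def centered_entry(A, i, j):
    """<A_i, A_j>^2 - (n - 2), the left-hand side of the identity."""
    inner = sum(x * y for x, y in zip(A[i], A[j]))
    return inner ** 2 - (len(A) - 2)

def link_sum(A, i, j):
    """Sum over ordered pairs p != q of A_ip * A_iq * A_jp * A_jq (one term per link)."""
    n = len(A)
    return sum(A[i][p] * A[i][q] * A[j][p] * A[j][q]
               for p in range(n) for q in range(n) if p != q)

rng = random.Random(3)
A = sample_signs(7, rng)
mismatches = [(i, j) for i in range(7) for j in range(7)
              if i != j and centered_entry(A, i, j) != link_sum(A, i, j)]
print(mismatches)
```

Each term of `link_sum` corresponds to one link in the terminology introduced next: a 4-cycle through the center vertices $i, j$ and the peripheral vertices $p, q$.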

The expression $\mathrm{Tr}(M^k)$ is a sum over monomial products over variables $\{A_{jp}\}_{j,p \in [n]}$, where
each monomial product corresponds to a labelling $\mathcal{L} : F \to [n]$ of a graph $F$. Each entry in $M_{ij}$
corresponds to a sum over links, where each link is a cycle of length 4, with the vertices $i,j$ on
opposite ends of the cycle, and the necessarily distinct vertices $p, q$ on the other opposite ends
of the cycle. We will refer to $i,j$ as the center vertices and $p, q$ as the peripheral vertices of the link.
Each edge $(u,v)$ of the link is weighted by $A_{uv}$. Since $A_{ii} = 0$ for all $i \in [n]$, for every contributing
labelling, it can never be the case that one of $p, q$ equals $i$. Each monomial product in the summation
$\mathrm{Tr}(M^k)$ corresponds to a labelling $(F, \mathcal{L})$ of the graph $F$, where $F$ is a cycle with $k$ links. $F$ has $4k$
edges, and in total it has $3k$ vertices.





The quantity $\mathrm{Tr}(M^k)$ is equal to the sum over all labellings of $F$. Taking the expectation, terms
in $\mathbb{E}[\mathrm{Tr}(M^k)]$ which contain a variable $A_{uv}$ with multiplicity 1 have expectation 0. Thus, $\mathbb{E}[\mathrm{Tr}(M^k)]$
is equal to the number of labellings of $F$ in which every edge appears an even number of times.

We prove that any such contributing labelling $\mathcal{L} : F \to [n]$ has at most $3k/2 + 1$ unique vertex
labels. We proceed by induction on $k$, the length of the cycle. In the base case, we have a cycle on
two links; by inspection no such cycle can have more than 5 labels, and the base case holds.

Now, consider a cycle of length $k$. If every label appears at least twice, then we are done, since there
are $3k$ vertices in $F$. Otherwise, there must be a vertex whose label appears only once.

There can be no peripheral vertex whose label does not repeat, since the two center vertices
neighboring a single peripheral vertex cannot have the same label in a contributing term, as $M(i, i) = 0$.
Now, if there exists a center vertex $i$ whose label does not repeat, it must be that there is a matching
between its $p, q$ neighbors so that every vertex is matched to a vertex of the same label; we identify
these same-label vertices and remove $i$ and two of its neighbors from the graph, leaving us with a
cycle of length $k - 1$, having removed at most one label from the graph. The induction hypothesis
now applies, and we have a total of at most $3(k-1)/2 + 2 \le 3k/2 + 1$ labels, as desired.

Thus, there are at most 3A:/2 -|- 1 unique labels in any contributing term of E[Tr(M*^)]. We 
may thus conclude that E[Tr(M^)] < • {3k/2 + 1)^^, and applying Lemma 4.1, we have that 

||M|| < with probability at least 1 — 0{n~^). 

Therefore, 2TZ7l^ = {{n — 1)^ — n -|- 2)Id„ + {n — 2) J„ -t- M, and we may conclude that all 
eigenvalues of TlTl^ are (1 ± o(l)) • n^, which implies the same of Pw = Since the range of 

Pw and liw is the same, we finally have that with probability 1 — o(l) 

(1 o(l))/n^ • Pw P P-w ^ (1 - o{l))/n^ ■ Pw, 


as desired. 


□ 


The following lemma allows us to approximate the projector to $V_0 \cup V_1$ by a matrix that is easy
to describe; we will use this matrix as an approximation to the projector in later proofs.

Lemma 4.4. Let $\Pi_{01}$ be the projection to the vector space $V_0 \cup V_1$. Let $P_{01} \in \mathbb{R}^{\binom{n}{2} \times \binom{n}{2}}$ be a matrix
defined as follows:

\{a,b,c,d}\ =2 
^ \{a,b,c,d}\ = 3 

_0 |{a, b, c,d}\ = 4. 


Poi {(^b, cd) = < 


Then 


Eoi EPoi E(^).noi, 

Proof. We will write down a basis for $V_0 \cup V_1$, take a summation over its outer products, and then
argue that this summation approximates $\Pi_{01}$. The vectors $v_1, \ldots, v_n \in \mathbb{R}^{\binom{n}{2}}$ are a basis for $V_1 \cup V_0$:


Vi{a,b) 



{a, 6} = {i,-} 
otherwise. 


For any two Vi,Vj, we have {vi,Vj) = Let U G be the matrix whose ith column is given 

by Vi- Notice that the eigenvalues of Y^^VivJ = UU'^ are equal to the eigenvalues of U'^U, and 
that U'^U = Therefore, as both matrices have the same column and row spaces, 


Hoi ^ - ^^01, 

i 

Now, let Poi = VivJ ; we can explicitly calculate the entries of Pqi, 


Poi(a6, cd) 


^ |{a,6,c,d}| = 2 

< |{a,6,c,d}| = 3 

|{a,6,c,d}| = 4. 


The conclusion follows. 


□ 


We will require the fact that $W$ lies mostly outside of $V_0 \cup V_1$, which we prove in the following
lemma.

Lemma 4.5. With probability at least 1 — 

linoinu^noill . 

Proof. Call M = noinn/IIoi. We will apply the trace power method to M. By Lemma 4.4 
and Lemma 4.3, we may exchange Hw for Ojuf and IIoi for Pqi. Letting M'^ = 

(we have by the cyclic property of the trace that ]E[Tr(M*^)] < Tr(M'^). 

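We consider the expression for $\mathbb{E}[\mathrm{Tr}(M'^k)]$. Let a chain consist of a set of quadruples
$\{a_\ell, b_\ell, c_\ell, d_\ell\}_{\ell \in [k]} \in [n]^4$ such that for each $\ell \in [k]$, we have $|\{a_\ell, b_\ell\} \cap \{c_{\ell-1}, d_{\ell-1}\}| \ge 1$ (where
we identify $a_\ell$ with $a_{\ell \bmod k}$). Let $\mathcal{C}_k$ denote the set of all chains of size $k$. We have that,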

T:(M^) < Tr(M'^) = JJ i±^ . 

{ae,bi,Ci,de}e^ik]eCk ^=l 

where r^ = or depending on whether one or both of a£, b^ are common with the following 
link in the chain. The quantity $\mathrm{Tr}(M^k)$ consists of cycles of $k$ links; each link is a star on 4 outer
vertices $a_\ell, b_\ell, c_\ell, d_\ell$ with center vertex $i_\ell$, and the non-central vertices of the link must have at least
one vertex in common with the next link, so each link has 4 edges and the cycle is a connected
graph. See the figure below for an illustration (dashed lines indicate vertex equality, and are not
edges).




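[Figure: three consecutive links $\ell - 1$, $\ell$, $\ell + 1$ of the cycle; each link is a star with center $i_\ell$ and outer vertices $a_\ell, b_\ell, c_\ell, d_\ell$.]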

Each term in the product has a factor of at most due to the scaling of the entries of

Poi and Pw- Thus we have 


E[Tr(M'*)]<(^ E E 


E 


h,-,ik {ae,be,ce,d(}i^[k]^Cn u=l 


,ai^ 


ii,di 


The only contributing terms correspond to those for which every edge variable in the product has
even multiplicity. Each contributing term is a connected graph and has $4k$ edges and at most $5k$
vertices where every labelled edge appears twice, so we may apply Proposition 4.2 to conclude that
there are at most $2k + 1$ labels in any such cycle. We thus have that

E[Tr(M'^)] < • (5A;)!, 

and applying Lemma 4.1, we conclude that ||M|| < ^ with probability 1 — as 

desired. □ 

We combine the above lemmas to bound the norm of one final matrix that arises in the
computations in Theorem 3.1.

Lemma 4.6. With probability 1 — 0{n~^), 

Di^Il 2 UwP- 2 Di^ E 0 {n) ■ Ho + 0(log^ n) • Id^. 

W 

Proof. We begin by replacing $\Pi_2$ with $(\mathrm{Id} - \Pi_{01})$, as by Lemma 4.5, $\Pi_{01}$ can be replaced by $P_{01}$
which has a convenient form. For any vector $x \in \mathbb{R}^{\binom{n}{2}}$,

x'^ DiI{2Pw^2D^ x = DiliwD^ X - 2 x'^ AHoiLlw x + x'^ AnoiLlwnoi x 

— ^ (IlnrvAa^ll^ + 2||nn/noi A3;|| • ||nu/Aa^|| + UnwHoi Aa^ll^) 


— ^ DJlwDi j ® + 2 I '^{DixYliQiIiw^oiDi 


< 2x'^ ( 'y ^ AErv A j -|- 2n UnoiLliyLli 


oill • Al 


where to obtain the second line we have applied Cauchy-Schwarz, to obtain the third line we have
used the fact that $a^2 + b^2 \ge 2ab$, and to obtain the final line we have used the fact that $\|D_i x\| = \|x\|$.
Now, the second term is O(log^n) • ||x|p with overwhelming probability by Lemma 4.5. It 
remains to bound the first term. To this end, we apply Lemma 4.3 to replace with = 

■ Let M = -fz ■ DiPwDi. An entry of M has the form 


i¥=j 
Thus we can see that M = -J/n\ + BB^, where Jfn\ is the all-ones matrix in ^(2)^(2) and

" (2) ^ ’ (2) 

B is the matrix whose entries have the form 

5(0.6, ij) — 

The matrix B is actually equal to the matrix J 4 1 from [DM15a], and by Lemma A.3 has ||5|| < n 
with probability 1 —0(n“^). We can thus conclude that with probability 1 —0(n“^), ||M —^ || < 

< 0 ( 1 ), and so x^Mx < l('n^)^+x’^(M —n“^ J)x < 0 (n) • ||nox|p+ 0 ( 1 ) • ||x|p, 

which gives the desired result. □ 

A Matrix Norm Bounds from Deshpande and Montanari


In this appendix, we give for completeness a list of the bounds proven by Deshpande and Montanari 
[DM15a] that were not included in the body above out of space or expository considerations. 

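Definition A.1. Let $A = \{a, b\} \subset [n]$ be disjoint from $B = \{c, d\} \subset [n]$. For $\eta \in \{1, \ldots, 4\}$ and
for $\nu(\eta) \in [\binom{4}{\eta}]$ we define the matrices $\tilde J_{\eta,\nu}$ as follows:

$\tilde J_{1,1}(A,B) = A_{ac}$
$\tilde J_{1,2}(A,B) = A_{ad}$
$\tilde J_{1,3}(A,B) = A_{bc}$
$\tilde J_{1,4}(A,B) = A_{bd}$

$\tilde J_{2,1}(A,B) = A_{ac} A_{bd}$
$\tilde J_{2,2}(A,B) = A_{ac} A_{bc}$
$\tilde J_{2,3}(A,B) = A_{ac} A_{ad}$
$\tilde J_{2,4}(A,B) = A_{ad} A_{bd}$
$\tilde J_{2,5}(A,B) = A_{bc} A_{bd}$
$\tilde J_{2,6}(A,B) = A_{ad} A_{bc}$

$\tilde J_{3,1}(A,B) = A_{ac} A_{ad} A_{bc}$
$\tilde J_{3,2}(A,B) = A_{ac} A_{bc} A_{bd}$
$\tilde J_{3,3}(A,B) = A_{ac} A_{ad} A_{bd}$
$\tilde J_{3,4}(A,B) = A_{ad} A_{bc} A_{bd}$

$\tilde J_{4,1}(A,B) = A_{ac} A_{ad} A_{bc} A_{bd}$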


And further, letting V : 1 ^( 2 )^( 2 ) to be the matrix projector such that 


{VM)a,b 


Ma,b |A U .B| = 4 
0 otherwise, 


we define and finally we define = pga 4 • J'rj^y (as in Deshpande and Montanari), 

so that Q = Y.ti=i Y^li Jri,y 

Notice that since we have defined An = 0 and since |{a,&}| = |{c, d}| = 2, we have Jaa = '^4,1- 
For some of the terms, the J is never considered; however for some terms it is cleaner to bound 
the spectral norm of J in the subspace V 2 , and so Deshpande and Montanari provide trace power 
method bounds on the difference in norm: 

Lemma A.2 (Lemma 4.26 in [DM15b]). With probability at least 1 — 6n“®, for each t] <2 and for 
each u < (^), 

^ Q:4n. 


Deshpande and Montanari use the trace power method to bound the norm of $Q$ by bounding
the norms of the $J_{\eta,\nu}$ individually. Some of the $J_{\eta,\nu}$ matrices have Wigner-like behavior.

Lemma A.3 (Lemmas 4.21, 4.22 in [DM15b]). With probability 1 — 0{n~^), we have that for each 
(r?,i/) G{(2,1),(2,6),(3,-),(4,1)}, 

IIII ^ *^4 ■ 

A select few of the $J_{\eta,\nu}$ have larger eigenvalues.

Lemma A. 4 (Lemmas 4.23, 4.24 in [DMlSb]). With probability 1 — 0(n“^), we have that for each 
(r?,i/) G{(1,-),(2,2),(2,3),(2,4),(2,5)}, 

ll'^y.i^ll < 04 • 

We also give a short proof of an observation of Deshpande and Montanari, which states that
some of the $J_{\eta,\nu}$ vanish when projected to $V_2$:

Observation A.5 (Lemmas 4.23, 4.24 in [DM15a]). Let $\Pi_2$ be the projector to $V_2$. Then always,


U 2 



= 0 , and similarly. 


n 2 (J 2,3 + -^2,5)11 = 0 . 


Proof. The proof follows from noting that the range of both of these sums of Jn^u is in Li- Consider 
some vector v G M^z); let v' We will look at the entry of v' indexed by the 

disjoint pair A = {a, b}. By definition of the we have that 


v'a — ( {^a,c + ^a,d) + (^6,c + 

c,dE[n] 


^^(Aa,c + Aa^d)Vc,d 1 T ( ^ + ^b,d) 


Vc,d , 


c^d 


c^d 


and so by the characterization of Vi from Proposition 2.3 the vector v' G Vi. The conclusion follows. 
A similar proof holds for the matrix J 2,3 + J 2 , 5 . □ 

Finally, we use a bound on the norm of the matrix K, which is the difference of Lf 2,2 and the 
non-multilinear entries. 


Lemma A.6 (Lemma 4.25 in [DM15b]). Let K be the restriction of ^^ 2,2 ~ ^[^^ 2 , 2 ] to entries 
indexed by sets of size at most 3. With probability at least 1 — n~^), 

||iL|| < 0(a3n^/^). 

We also require bounds on the matrices used in the Schur complement steps. The bounds of 
Deshpande and Montanari suffice for us, since we do not modify moments of order less than 4. 

Lemma A.7 (Consequence of Proposition 4.19 in [DM15b]). Define Qn G M”’’"' to be the orthogonal 
projection to the space spanned by 1 . Suppose that a satisfies a\—a 2 > D,{a 2 n~^^‘^) and a2—2a1 > 0, 
ai > 0. Then with probability at least 1 — n~^, 


Hip h 0 


Hfl < 


n{a2P - ajp) 


Qn H- Q 

CXl 


± 

n • 


A.l Additional Proofs 

We prove Lemma 2.5, which follows almost immediately from the bounds of [DM15a]. 

Proof of Lemma 2.5. Using the matrices from Definition A.l and Observation A.5, we have that 

n2<3 = n2( J2,4 + <^2, 2 ) + n2 (yJ2p — <^ 2,4 + J2,2 — J2,2^ + Li2 + <^4,1^ , 

Il2^wQ = n2nvy(J2,4 + J2,2) + Lf2nn^ ^J2,4 — J2,4 + J2,2 “ J2,2^ + n2nw ^3,1/ + <A,1^ , 
where we have used the fact that the columns of $J_{2,4}$ and $J_{2,2}$ lie in $W$. We thus have

112(5 — II 2 UWQ = n2(/ — Hw) iyJ2,2 — J2,2 + </2,4 “ J2,4 + ^ , 

And by the bounds II II < a 4 -n and || J 4 ^i|| < a4-re from Lemma A.3 and the bounds || J2,4—J2,4|| < 
a 4 re and || J 2,2 ~ <^2,211 < from Lemma A.2, and because n 2 (/ — ^w) is a projection, the 
conclusion follows. □ 

Now, we prove that the trace power method works, for completeness. 

Proof of Lemma 4.1. The proof follows from an application of Markov's inequality. We have that
for even $k$,

$$\mathbb{P}[\|M\| \ge t] = \mathbb{P}[\|M^k\| \ge t^k] \le \mathbb{P}[\mathrm{Tr}(M^k) \ge t^k] \le \frac{1}{t^k} \mathbb{E}[\mathrm{Tr}(M^k)]$$


where we have applied Stirling’s approximation in the last step. Choosing k 
t = O ■ 7 • log re • re") completes the proof. 


(9(log re) and 
□ 